
TIBS 21 - MAY 1996

Structural and mechanistic relationships
between nucleic acid polymerases



Rui Sousa 



A superfamily of nucleic acid polymerases that includes the pol I and pol α
classes of DNA-directed DNA polymerases, mitochondrial and phage
DNA-directed RNA polymerases, and most RNA-directed polymerases may be
defined on the basis of the occurrence of conserved sequence motifs and
tertiary structure similarities between HIV-1 reverse transcriptase, DNA
polymerase I and T7 RNA polymerase. Although sequence or structural
similarities do not yet justify inclusion of the multi-subunit DNA-directed 
RNA polymerases in this superfamily, mechanistic similarities suggest a 
deep relationship between these and the simpler T7-like RNA polymerases. 



THE TERTIARY STRUCTURE of a 
template-directed nucleic acid polym- 
erase, that of the Klenow fragment of 
DNA polymerase I (DNAP I), was first 
described in 1985 (Ref. 1). Seven years 
were to pass before another polymerase 
structure appeared in the literature. 
During this period, development of the 
field depended largely on structure- 
function studies and on the identification 
of conserved sequence motifs among 
the increasing number of known polym- 
erase sequences. These efforts culmi- 
nated in an alignment that included 
most DNA-directed DNA polymerases 
(DNAPs), reverse transcriptases (RTs), 
RNA-directed RNA polymerases (RNAPs) 
and DNA-directed RNAPs (Ref. 2). How- 
ever, the tenuous nature of many of 
these sequence similarities cast doubt 
on the entire scheme, which therefore 
awaited confirmation and refinement, 
or rejection, based on further structure- 
function or structural studies. The past 
few years have seen the emergence of 
the structures of three new polymerases
(Refs 3-9), which now make it possible to
evaluate the significance of these patterns
of apparent sequence conservation.



Motif conservation

The pattern of polymerase motif conservation
identified by Delarue et al. (Ref. 2),
Poch et al. (Ref. 10) and Mendez et al. (Ref. 11) can be
seen in Fig. 1. Although more extensive
patterns of sequence similarity within
certain polymerase families have been
identified, we focus here on the limited
set of motifs that are most widely distributed.
It can be seen that there is a
correlation between polymerase template
or substrate specificity and motif
conservation. For example, motifs
T/DxxGR and B are found only in polymerases
that use DNA templates, while
motifs B' and D are restricted to polymerases
that use RNA templates and,
within the RNA-directed family of
polymerases, motif E is restricted to
polymerases that use dNTPs. Motifs A
and C unify the RNA- and DNA-directed
RNA or DNA polymerases because they
occur in polymerases of either template
or substrate specificity.

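The correlation between motif content and specificity described in this section can be written down as a small lookup. The Python sketch below is illustrative only: the function name `infer_specificity` and the set names are hypothetical, and the rules encode nothing beyond the motif distribution stated in the text (A and C universal; T/DxxGR and B in DNA-directed enzymes; B' and D in RNA-directed enzymes; E only in RNA-directed enzymes that use dNTPs).

```python
# Motif distribution as stated in the text; all names here are illustrative.
UNIVERSAL = {"A", "C"}            # found regardless of template or substrate
DNA_TEMPLATE = {"T/DxxGR", "B"}   # only in DNA-directed polymerases
RNA_TEMPLATE = {"B'", "D"}        # only in RNA-directed polymerases
DNTP_MARKER = "E"                 # within the RNA-directed family: dNTP users


def infer_specificity(motifs):
    """Guess (template, substrate) from the set of motifs found in a sequence.

    Either field is None when the motif set is not informative.
    """
    motifs = set(motifs)
    if not UNIVERSAL <= motifs:
        return None, None
    has_dna = bool(motifs & DNA_TEMPLATE)
    has_rna = bool(motifs & RNA_TEMPLATE)
    if has_dna == has_rna:  # neither class of motif, or conflicting evidence
        return None, None
    if has_dna:
        # These motifs alone do not separate DNA-directed DNAPs from RNAPs.
        return "DNA", None
    substrate = "dNTP" if DNTP_MARKER in motifs else "rNTP"
    return "RNA", substrate


print(infer_specificity({"A", "B'", "C", "D", "E"}))  # ('RNA', 'dNTP')
```

The sketch treats motif presence as a yes/no observation; in practice, deciding whether a weakly conserved motif is present is itself the difficult step.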
Relating sequence motifs to structure

It is instructive to examine the over- 
all structures of these enzymes before 
looking at where the sequence motifs 
occur within them (Fig. 2). The similarity
in the shape of the polymerase
domains of T7 RNAP, p66 HIV-1 RT and
DNAP I to a 'cupped right hand' has led
to the designation of the three subdomains
of the polymerase domain as
'fingers', 'palm' and 'thumb' (Ref. 4). The most
extensive similarity is seen between T7
RNAP and DNAP I: the folding and almost
all of the secondary structure of
their respective polymerase domains are
nearly identical, while the structural
similarity with HIV-1 RT is limited to
a core comprising most of the palm
subdomain.

Copyright © 1996, Elsevier Science Ltd. All rights reserved. 0968-0004/96/$15.00

Motifs A and C. Peering more deeply
into the large template-binding clefts
of these enzymes, we can localize active
sites that have been defined by
structure-function studies, structural
studies of polymerase-substrate/template
complexes, and sequence comparison
(Fig. 3). Most of the residues forming
these active sites are part of the sequence
motifs shown in Fig. 1. Motifs A
and C form three strands of a β-sheet
and a short segment of α-helix within
the core of the palm subdomain, which
is structurally similar in RT, T7 RNAP
and DNAP I. Two amino acids (Asp537/
Asp812 in T7 RNAP; Asp705/Asp882 in
DNAP I; Asp110/Asp185 in HIV-1 RT),
which are identified as invariant within
these motifs, are brought into alignment
when the three polymerases are superimposed.
These two Asp residues bind
and present two metal ions in the appropriate
geometrical arrangement to
catalyse a phosphoryl transfer reaction
at the active site (Ref. 12). A third well-conserved
carboxylate (Glu883 in DNAP I; Asp186
in RT; absent in T7 RNAP) is also expected
to be involved in catalytic metal
binding. Significantly, mutation of this
third carboxylate reveals that it is less
critical for activity than either of the
two invariant Asp residues (Refs 13,14).
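The residue correspondences given above can be collected into a small table for cross-referencing between enzymes. This is a minimal sketch: the dictionary layout and the function name `equivalent_residue` are hypothetical, the assignment of each invariant Asp to motif A or motif C follows the Figure 1 legend, and the "third" entry is simply the additional carboxylate named in the text.

```python
# Catalytic carboxylates as listed in the text and the Figure 1 legend.
# The layout and names are illustrative, not part of the original article.
CATALYTIC_CARBOXYLATES = {
    "T7 RNAP":  {"motif A": "Asp537", "motif C": "Asp812", "third": None},
    "DNAP I":   {"motif A": "Asp705", "motif C": "Asp882", "third": "Glu883"},
    "HIV-1 RT": {"motif A": "Asp110", "motif C": "Asp185", "third": "Asp186"},
}


def equivalent_residue(residue, source, target):
    """Map a catalytic residue of `source` to its counterpart in `target`.

    Returns None if the residue is not in the table or has no counterpart.
    """
    for role, name in CATALYTIC_CARBOXYLATES[source].items():
        if name == residue:
            return CATALYTIC_CARBOXYLATES[target][role]
    return None


print(equivalent_residue("Asp705", "DNAP I", "HIV-1 RT"))  # Asp110
```

The table records only positional equivalence after superposition; it says nothing about the relative importance of each residue for activity.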

Motifs B and B' are located in the fingers
subdomains of DNAP I and T7 RNAP,
and of RT, respectively. While motifs B and
B' are dissimilar in sequence and structure,
and occur within a subdomain that
is structurally dissimilar in the RNA-directed
versus DNA-directed polymerases,
they are similarly positioned
relative to the center of the active site
in both classes of polymerases. In
the structure of HIV-1 RT complexed
with primer-template, the fingers subdomain
and elements from motif B' contact
the template strand (Ref. 5). Modeling,
structural and mutation studies imply
that a region in the corresponding position
(including elements of motif B) of
the fingers subdomain of DNAP I or T7
RNAP would be similarly involved in
contacts with the template strand (Ref. 15). As
the template in the RT primer-template
structure does not extend downstream
of the 3' end of the primer, downstream
template contacts must be deduced
from modeling. The more compelling
model would place the downstream template
contacts on β-strands 3 and 4 of
RT, the loop between these strands and
(perhaps) the carboxy-terminal region
of β-strand 11a (Ref. 16). However, it is
also possible that these elements are involved
in substrate contacts (Refs 17,18). The
latter hypothesis would be consistent
with recent evidence from structural
and mutational analyses that the amino-terminal
region of motif B (helix O) in
DNAP I interacts with the dNTP phosphates
and ribose moiety (Refs 19,20). The
docked dNTP modeled by Arnold and
colleagues in the RT primer-template
complex would not contact the fingers
subdomain, but would instead establish
contact with elements of motifs A and C
and (possibly) with β-strand 11 (Ref. 16).
It is, therefore, unclear if the fingers
subdomain is involved in substrate as
well as template strand contacts, or if
this represents a case where analogous
structures in different polymerases have
different functions (i.e. a role in substrate
binding for the fingers of DNAP I
and T7 RNAP, but not for RT).

Functional roles of the T/DxxGR motif and
motif E. Irrespective of the question of
substrate binding, it is clear that the
fingers subdomains and elements of
motifs B and B' are involved in template-strand
binding in both the DNA-directed
and RNA-directed polymerases. It is,
therefore, intriguing that structural
similarity in the fingers subdomain and
conservation of motifs B and B' reflect
polymerase template specificity. In the
same way, we can examine the location
and proposed function of the T/DxxGR
motif, which occurs in the DNA-directed
polymerases, but not in the RNA-directed
polymerases. Mutational studies
and modeling of template-DNAP I or
template-T7 RNAP structures based on
the RT primer-template complex reveal
that the structure formed by this motif
is also involved in template-strand contacts
(Ref. 15). Along similar lines, it may be
noted that motif E, the only one of five
motifs conserved in RNA-directed DNA
polymerases that is not also conserved
in the RNA-directed RNA polymerases,
forms a structure designated the 'primer
grip', which is intimately associated
with the primer strand (Ref. 5).

Figure 1

Patterns of motif conservation in nucleic acid polymerases (Refs 2,10,11). Residues in blue are invariant.
Other residues given are well conserved: h, hydrophobic residue; +, positively charged residue;
x, any residue; [...], a sequence gap. Residue numbers of invariant residues in DNA-directed DNA
polymerase (DNAP) I, T7 DNA-directed RNA polymerase (RNAP) and HIV-1 reverse transcriptase
(RT) are: for T/DxxGR motif - DNAP I, Arg668; T7 RNAP, Arg425; for Motif A - DNAP I, Asp705;
T7 RNAP, Asp537; RT, Asp110; for Motif B - DNAP I, Lys758/Tyr766/Gly767; T7 RNAP,
Lys631/Tyr639/Gly640; for Motif B' - RT, Gly152; for Motif C - DNAP I, Asp882; T7 RNAP,
Asp812; RT, Asp185; for Motif D - RT, Lys220; and for Motif E - RT, Gly231.






Structural differences between T7 RNAP and
DNAP I. It is intriguing that there is one
region where DNAP I and RT are more
similar to each other than to T7 RNAP,
even though DNAP I and T7 RNAP show
greater structural similarity overall. RT
and DNAP I both exhibit a fourth
β-strand (β-strand 14 in DNAP I; 11a,
11b in RT), which extends the three-stranded
sheet formed by motifs A and
C. T7 RNAP lacks this fourth β-strand.
As DNAP I and RT share substrate (and
product) specificity, one possibility is
that this structural divergence between
the DNAPs and the RNAP is related to
substrate/product specificity. Such a
speculation is supported by the proposal
that the AZT-resistance mutations
at Lys219 of RT β-strand 11a exert their
effects directly through contact with
the dNTP (Ref. 21). It is also supported
by the observation that mutations of
Phe882 at the carboxyl terminus of T7
RNAP (which superimposes on Lys219
of RT) increase the rNTP Km (Ref. 22).

Figure 2

Structures of the polymerase domains of (a) p66 reverse transcriptase (RT), (b) DNA-directed DNA polymerase (DNAP) I and (c) the complete T7
DNA-directed RNA polymerase (RNAP) molecule (Refs 1,3-9). The 'thumb' subdomains are colored green, the 'palm' subdomains are in red, and the
'fingers' subdomains are blue. Structural elements in T7 RNAP that have no counterpart in the DNAP I polymerase domain are colored light gray
('Extra1', 'N-Term' domain, 'C-Term to Palm') or orange ('Extra2'). The single magenta-colored helix in DNAP I and T7 RNAP is not formally
considered part of the polymerase subdomain, but is conserved between T7 RNAP and DNAP I. The two green-colored spheres mark the positions of
the invariant Asp residues, which identify the center of the active site.

Figure 3

[...] structures of (a) reverse transcriptase (RT), (b) DNA-directed DNA polymerase (DNAP) I with the 'thumb' subdomain [...]

Alternatively, this structural pattern
might be related to the fact that T7 RNAP
uses a double-stranded template while
DNAP I and RT use partially single-stranded
templates. A fourth β-strand
positioned in T7 RNAP analogously to
the β-strand seen in DNAP I and RT
could clash sterically with the domain
we call 'Extra1', which is present in T7
RNAP, but not DNAP I or RT, and which
might be involved in unwinding. This
fourth β-strand could also occlude a
groove in T7 RNAP in which the unwound
non-template strand might bind.
Truncation of the carboxyl terminus of
T7 RNAP to remove the fourth β-strand
might then reflect the need to remove
structures that would sterically clash
with the unwound non-template strand
or with the protein domains responsible
for unwinding.

Summary of motif structure-function correlations.
Within a superfamily that includes
the DNA pol I class of enzymes, the
phage and mitochondrial RNAPs, the
majority of RNA-directed polymerases,
and (perhaps) the pol α class of enzymes,
we can catalogue a set of correlations
between conservation of structural
elements and the functions of
those elements. Motifs A and C, and
most of the palm subdomain, are conserved
irrespective of polymerase
template or substrate specificity, reflecting
a direct role for these structures in
the activity common to all polymerases:
phosphodiester bond formation.
Conservation of motifs T/DxxGR, B and
the fingers domain in the DNA-directed
polymerases, and conservation of motif
B' and a distinct fingers domain in the
RNA-directed polymerases, reflect a
role for these elements in template-strand
binding. The unique position of
motif E, as the motif that is conserved
only in the DNA-synthesizing polymerases
within the RNA-directed class of
enzymes, reflects the role of this structural
element in product (i.e. primer)
contacts. Finally, structural similarity in
β-strands 14 and 11 of DNAP I and RT,
respectively, and divergence from T7
RNAP in the corresponding region
might reflect a role for this structural
element in substrate/product contacts
or the structural requirements for utilization
of a double-stranded template.

Modularity in polymerase architecture

One way to look at polymerase struc- 
ture is to imagine that polymerases are 
assembled from modules whose struc- 
tural conservation often reflects com- 
mon function, as illustrated in Fig. 4. 
The fingers, palm and thumb sub- 
domains together form the polymerase 
domain. The thumb subdomains, whose 
functions have not yet been addressed, 
are the least well conserved of these 
three subdomains, but appear to be 
similar in certain general features: they 
are extended, flexible, predominantly
α-helical structures that are involved in
conferring processivity on polymerization
by wrapping around bound template
and/or interacting directly with
the template to inhibit polymerase-template
dissociation (Refs 23-26).

The function of the complete polym- 
erase domain appears limited to the 
minimal function of template-directed 
processive polymerization. Specific pol- 
ymerases can display additional activ- 
ities, but these activities appear to reside 
on distinct domains. Thus, T7 RNAP is 
capable of sequence-specific (promoter) 
DNA binding and template unwinding. It 
emerges that promoter-specific binding
may be largely conferred on T7 RNAP
by the 'Extra2' domain (Fig. 4), as
amino acids critical for promoter recognition
map to this loop (Refs 27,28). Template-unwinding
activity in T7 RNAP has not
been mapped, but it is possible that the
Extra2 domain plays a role in this process
as it is placed at the leading end of
the polymerase and forms two grooves
into which the unwound strands of DNA
might fit. The major accessory domains
of each of these polymerases (RNase H
in RT; Exo proofreading in DNAP I;
amino-terminal domain in T7 RNAP) are
responsible for distinctive activities in
each enzyme and bear no structural
similarity to one another, although they
do occupy roughly equivalent positions
so that they can interact with nucleic
acid strands upstream of the polymerase
active site.

Figure 4

A modular architecture for polymerases is suggested. The white lines indicate how different structural elements are linked to each other. The
functions of these structural elements and their occurrence in different polymerases are indicated.
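The modular view of polymerase architecture described in this section can be sketched as a simple data structure. This is an illustration under stated assumptions, not a model taken from the article: the class `Polymerase`, the list `ENZYMES` and the function `shared_modules` are hypothetical names, and the sketch treats 'fingers' as a single label even though the text notes that the fingers subdomains of RNA-directed and DNA-directed polymerases are structurally dissimilar.

```python
from dataclasses import dataclass

# The three subdomains that together form the polymerase domain.
POLYMERASE_DOMAIN = ("fingers", "palm", "thumb")


@dataclass(frozen=True)
class Polymerase:
    name: str
    accessory_domain: str       # major accessory domain named in the text
    extra_modules: tuple = ()   # additional modules described for T7 RNAP

    def modules(self):
        """All modules of this enzyme, polymerase domain first."""
        return POLYMERASE_DOMAIN + (self.accessory_domain,) + self.extra_modules


ENZYMES = [
    Polymerase("HIV-1 RT", "RNase H"),
    Polymerase("DNAP I", "Exo proofreading"),
    Polymerase("T7 RNAP", "amino-terminal domain", ("Extra1", "Extra2")),
]


def shared_modules(enzymes):
    """Modules present in every enzyme, in the order of the first enzyme."""
    common = set.intersection(*(set(e.modules()) for e in enzymes))
    return [m for m in enzymes[0].modules() if m in common]


print(shared_modules(ENZYMES))  # ['fingers', 'palm', 'thumb']
```

Only the polymerase domain is common to all three enzymes in this representation, which mirrors the point that the distinctive activities reside on separate accessory modules.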

The scheme for polymerase organization
presented in Fig. 4 suggests that
gene fusion and recombination events
might have played a role in the evolution
of modern multidomain polymerases
from multi-subunit enzymes composed
of more simple polypeptides. In this
light, the structure of the middle RNAP
of phage N4 is intriguing because it
shows homology to the complete T7
RNAP polymerase domain, but is composed
of two subunits (designated P4
and P7; L. Rothman-Denes, pers. commun.).
The break in the N4 RNAP places
the amino-terminal portion of the palm and
thumb subdomains onto one subunit,
and the conserved catalytic core and
fingers structures on the other. Thus, in
this instance, the subunit separation
coincides with one of the subdomain
divisions presented in Fig. 4.

Structure and extent of this polymerase 
superfamily 

Using the alignments of Delarue et
al. (Ref. 2) and Poch et al. (Ref. 10) as a guide, it is
expected that the pol I class of DNAPs and
the phage and mitochondrial RNAPs
will display polymerase domain structures
with palms and fingers similar to
those shown in Fig. 4, while the structures
of the thumb subdomains might
show variation within the range revealed
in this figure. The alignment by
Delarue et al. included the pol β class of
DNA polymerases in its scheme for
unification, but also suggested that the
pol β class was exceptionally divergent
because it was unique among the DNA-directed
polymerases in lacking an
identifiable motif B sequence. Analysis
of the X-ray structure of pol β suggests
that, in fact, this polymerase is not related
to DNAP I, RT or T7 RNAP and
might be more closely related to
nucleotidyl transferases, which are not
polymerases (Ref. 29). The pol α class of DNA
polymerases is expected to display a
palm like that of DNAP I or T7 RNAP;
however, changes in the spacing between
motifs A, B and C imply at least
some topological rearrangements in the
fingers domain of the pol α class of
polymerases relative to the pol I class.
For the RNA-directed polymerases,
analysis of the spacing between motifs
also suggests that the palm domain 
structures will be relatively invariant, 
while there might be variations in the 
fingers-domain structures of the most 
distantly related classes. Alternatively, 
variations in motif spacing could be 
accommodated in 'extra' modules and 
variation in lengths of different elements 
on the periphery of these domains, 
which would leave the folding of their 
cores unchanged (as is seen, for example, 
in the comparison of the T7 RNAP and 
DNAP I structures). 



Status of the multi-subunit RNAPs

Not included in the Delarue et al.
alignment were the multi-subunit
RNAPs (prokaryotic RNAPs, eukaryotic
RNAPs I, II and III). The identified sequence
similarities between the catalytic
subunits of these RNAPs and
polymerases of the RT-DNAP I-T7
RNAP superfamily (Refs 11,30,31) are not extensive
enough to be conclusive without
confirmation from structures at atomic
resolution and, despite observations of
similarity in the overall shape of the
multi-subunit RNAPs and T7 RNAP (Refs 32-34),
it might be more profitable to compare
the extensive mechanistic, rather than
the structural, similarities of these enzymes.
These include similarities in the
timing, triggering and conformational
changes involved in the transition from
the poorly processive initiation phase
of transcription to the elongation
phase (Refs 35-41), and the utilization of similar
promoter (Refs 42,43) and terminator
sequences (Refs 44-47) by both classes of polymerases.
The observation that the yeast
RNAP holoenzyme is a dimer composed
of a core enzyme that is homologous to
T7 RNAP and a polypeptide that is
functionally and structurally similar to
the sigma subunit of Escherichia coli
RNAP (Ref. 48) might also be relevant to
understanding the relationship between
the single-subunit and multi-subunit
RNAPs.

Concluding remarks

The significance of the observed
pattern of structural similarities between
nucleic acid polymerases, the
correlations between structural similarity
and the functional role of different
structural elements, and the mechanistic
similarities between the multi-subunit
RNAPs and simpler polymerases
like T7 RNAP have yet to be fully
explored. Do the observed patterns
reflect strict functional requirements
for the utilization of different templates
or substrates, or do they reflect contingent
events in the evolutionary history
of polymerases? To what degree can the
different functionalities displayed by a
polymerase like T7 RNAP be distinctly
mapped to different structural modules?
Studies aimed at answering these questions
and others might profit from a
perspective that integrates the recognized
structure-function relationships
of nucleic acid polymerases.

Work in the author's laboratory is